Percolation and Epidemic Thresholds in Clustered Networks

We develop a theoretical approach to percolation in random clustered networks. We find that,
although clustering in scale-free networks can strongly affect some percolation properties, such as
the size and the resilience of the giant connected component, it cannot restore a finite percolation
threshold. In turn, this implies the absence of an epidemic threshold in this class of networks,
thus extending this result to a wide variety of real scale-free networks which show a high level of
transitivity. Our findings are in good agreement with numerical simulations.

Perhaps one of the main reasons for the growing interest
in complex networks is that many systems in the real world,
either naturally evolved or artificially designed, are
organized in a networked fashion. This makes any
theoretical approach potentially applicable to many
different fields in the short term. As a germane example,
percolation on networks has been one of these theoretical
advances which has helped to understand, for instance, the
high resilience of scale-free (SF) networks against the
removal of a fraction of their constituents, with important
implications for communication systems like the Internet
and other peer-to-peer networks [3].

In addition to its high theoretical interest, percolation
theory serves as a conceptual framework to treat more
factual problems on networks, such as the dynamics of
epidemic spreading. Indeed, the susceptible-infected-removed
(SIR) model of epidemic spreading can be mapped into a bond
percolation problem. This is one of the simplest models in
the literature [9, 10], with three different states for the
elements of the population: susceptible, infected, and
removed. In its bare formulation, it is characterized by the
time that an individual remains infected and the time that
an infected individual takes to infect a susceptible
neighbor, both random variables following a Poisson process
but with different constant rates. Since the infection uses
the network as a template to spread, the process of
propagation can be understood as a percolation problem over
the original network where each edge is removed with
probability $q_{inf} = 1 - p_{inf}$, $p_{inf}$ being the
likelihood that an infected individual infects a susceptible
neighbor before becoming removed. This mapping stands as an
example of the importance of percolation theory beyond
theoretical concerns.
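The mapping can be illustrated with a minimal sketch. It assumes that infection along an edge and removal of the infected node are independent processes with constant rates, here called `beta` and `mu` (hypothetical names, not used in the text). Under that assumption the probability that infection happens first is beta / (beta + mu), which plays the role of $p_{inf}$; the Monte Carlo helper is only a sanity check of that expression.

```python
import random

def transmissibility(beta: float, mu: float) -> float:
    """p_inf: probability that infection along an edge (rate beta)
    occurs before the infected node is removed (rate mu)."""
    return beta / (beta + mu)

def estimate_transmissibility(beta: float, mu: float,
                              trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo check: draw both exponential waiting times and count
    how often the infection time is the shorter one."""
    rng = random.Random(seed)
    hits = sum(rng.expovariate(beta) < rng.expovariate(mu)
               for _ in range(trials))
    return hits / trials

p_inf = transmissibility(beta=0.3, mu=0.1)
q_inf = 1.0 - p_inf   # probability that an edge is removed in the percolation picture
```

With these illustrative rates, the estimate from `estimate_transmissibility(0.3, 0.1)` should fall close to the closed-form value returned by `transmissibility`.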

Percolation properties of random directed and undirected
networks with given degree distributions and two-point
correlations have been extensively studied. One of the most
striking results, due to its important implications, is the
absence of a percolation threshold in uncorrelated random SF
networks. In other words, in this type of network, one has
to remove virtually the totality of the constituents before
the network fragments into disconnected components.
Translated into the epidemic context, this means that an
epidemic threshold below which the epidemic cannot propagate
does not exist. This result is particularly important
because a large number of real networks have an SF degree
distribution. It has also been generalized to the case of
random SF networks with two-point correlations, both for the
SIR model and for the susceptible-infected-susceptible (SIS)
model of epidemic spreading.

Nevertheless, almost all the analytical results obtained to
date implicitly refer to networks without clustering, and
little is known about its effects on the percolation
properties of such networks, with the exception of Ref. [18],
where an analytical solution for the percolation properties
of the one-mode projection of random bipartite graphs was
developed. This is due to the fact that those analyses are
based on the idea of a branching process. This approach works
well when the network is locally tree-like and, thus, the
clustering coefficient is very small. Real networks, however,
are shown to have a significant level of clustering that may
change the percolation properties significantly. In this
paper, we present analytical and simulation results for
percolation in clustered networks. The analytical
approximation becomes exact in the limit of weak clustering,
and simulations are also provided in the case of strong
clustering. We find that clustering makes networks more
fragmented than their unclustered counterparts, but with
giant components which have more tightly interconnected cores
of high-degree vertices. We also find that clustering cannot
restore the percolation and epidemic thresholds in SF
networks.

To begin with, we follow Ref. [20] and define the
multiplicity of an edge, $m_{ij}$, as the number of triangles
in which the edge connecting vertices $i$ and $j$
participates. This quantity is the analog to the number of
triangles attached to a node $i$, $T_i$, which is used to
define the local clustering coefficient. At the
coarse-grained level of degree classes, one can define the
multiplicity matrix $m_{kk'}$ as the average multiplicity of
the edges connecting the classes $k$ and $k'$. Then, the
degree-dependent clustering coefficient $c(k)$ (a property of
vertices) and the multiplicity matrix $m_{kk'}$ (a property
of edges) are related through the following identity, valid
for any network



$$\sum_{k'} m_{kk'}\,P(k,k') = k(k-1)\,c(k)\,\frac{P(k)}{\langle k\rangle}, \tag{1}$$



where $P(k)$ is the degree distribution and $P(k,k')$ is the
probability that one edge connects two vertices of degrees
$k$ and $k'$. The multiplicity matrix $m_{kk'}$, which varies
in the range $[0, m^{c}_{kk'}]$ with
$m^{c}_{kk'} = \min(k,k') - 1$, gives a more detailed
description of how triangles are shared among vertices of
different degrees and, as we shall see, it contains the
relevant information to analyze the percolation properties of
clustered networks.
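As a concrete illustration of edge multiplicity, the sketch below computes $m_{ij}$ and $T_i$ on a small hand-built graph and checks the node-level counting that underlies Eq. (1): each triangle at node $i$ uses two of its edges, so the multiplicities of the edges of $i$ sum to $2T_i$. The toy graph and the function names are illustrative assumptions, not part of the original work.

```python
from itertools import combinations

def edge_multiplicities(adj):
    """Return {(i, j): m_ij} with i < j, where m_ij is the number of
    triangles containing edge (i, j), i.e. the number of common neighbours."""
    return {(i, j): len(adj[i] & adj[j])
            for i in adj for j in adj[i] if i < j}

def triangles_per_node(adj):
    """T_i: number of triangles attached to node i."""
    return {i: sum(1 for u, v in combinations(sorted(adj[i]), 2) if v in adj[u])
            for i in adj}

# Toy graph: triangles 0-1-2 and 0-2-3 sharing edge (0, 2), plus a pendant node 4.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (0, 3), (3, 4)]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

m = edge_multiplicities(adj)
T = triangles_per_node(adj)
```

In this example the shared edge (0, 2) has multiplicity 2, which saturates the bound min(k, k') - 1 for two vertices of degree 3, while the pendant edge (3, 4) has multiplicity 0.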

An alternative way to quantify clustering is by using the
edge clustering coefficient as defined in [21],

$$c(k,k') = \frac{m_{kk'}}{\min(k,k') - 1}. \tag{2}$$



As in the case of the local clustering coefficient,
$c(k,k')$ also has a probabilistic interpretation. It
quantifies the likelihood that a pair of connected vertices
have a common neighbor. If the network is random, we can
assume that the probability that an edge connecting two
vertices of degrees $k$ and $k'$ has multiplicity $m$ is

$$\phi(m|k,k') = \binom{m^{c}_{kk'}}{m}\,[c(k,k')]^{m}\,[1 - c(k,k')]^{m^{c}_{kk'} - m}. \tag{3}$$



This probability, along with the multiplicity matrix, is
crucial for correctly computing the percolation properties
of clustered random networks. Although we start from a given
vertex and follow all its edges as in the non-clustered
case, once we are placed in one of the neighbors we only
follow those edges not pointing to the neighborhood of the
source vertex, so that we avoid the edges responsible for
clustering. It is worth noticing that, even in this scheme,
we are neglecting the fact that higher order loops may be
present.
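The binomial form of Eq. (3) can be checked numerically. The sketch below evaluates it for one illustrative pair of degrees and confirms that the probabilities sum to one and that the mean multiplicity recovers $m_{kk'}$. The numerical values and the function names are assumptions chosen for the example.

```python
from math import comb

def edge_clustering(m_kk: float, k: int, kp: int) -> float:
    """Eq. (2): c(k, k') = m_kk' / (min(k, k') - 1); needs min(k, k') >= 2."""
    return m_kk / (min(k, kp) - 1)

def phi(m: int, k: int, kp: int, m_kk: float) -> float:
    """Eq. (3): binomial probability that an edge joining degrees k and k'
    has multiplicity m."""
    m_max = min(k, kp) - 1
    if m_max == 0:                 # a degree-1 end cannot belong to a triangle
        return 1.0 if m == 0 else 0.0
    c = edge_clustering(m_kk, k, kp)
    return comb(m_max, m) * c**m * (1.0 - c)**(m_max - m)

k, kp, m_kk = 5, 8, 1.2            # illustrative degrees and average multiplicity
probs = [phi(m, k, kp, m_kk) for m in range(min(k, kp))]
mean_m = sum(m * p for m, p in enumerate(probs))
```

Here the maximum multiplicity is min(5, 8) - 1 = 4, so the edge clustering coefficient is 1.2 / 4 = 0.3 and the binomial mean returns the average multiplicity that was put in.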

Let us start the analytical computations by defining the
probability that a given vertex has $s$ reachable vertices
(including itself), $G(s)$. For very heterogeneous networks
it is more convenient to define this probability conditioned
on the degree of the source vertex, $G(s|k)$, and then
$G(s) = \sum_k P(k)\,G(s|k)$. Finally, we need to introduce
an extra function, $g(s|k)$, which measures the probability
that a vertex can reach $s$ other vertices given that it is
connected to a vertex $v$ of degree $k$ and that it can visit
neither $v$ nor its neighborhood (this idea was used in [22]
to compute the number of second neighbors of a given vertex).
This last condition guarantees that we do not overcount
contributions due to triangles. The functions $G(s|k)$ and
$g(s|k)$ are related through



$$G(s|k) = \sum_{s_1,\dots,s_k} g(s_1|k)\cdots g(s_k|k)\,\delta_{s,\,1+s_1+\cdots+s_k}. \tag{4}$$



We can find a recursion relation for g{s\k) taking into ac- 
count that now the branching process has the constraint 
that at each generation point we can only use the free 
edges to continue the exploration. In this case 



$$g(s|k) = \sum_{k'} P(k'|k) \sum_{m=0}^{m^{c}_{kk'}} \phi(m|k,k') \sum_{s_1,\dots,s_{k'_c}} g(s_1|k')\cdots g(s_{k'_c}|k')\,\delta_{s,\,1+s_1+\cdots+s_{k'_c}}, \tag{5}$$

where $k'_c = k' - m - 1$. To simplify this equation we make
use of the so-called generating function formalism and
transform $g(s|k)$ to the discrete Laplace space,
$\hat{g}(z|k) = \sum_s z^s g(s|k)$, where Eq. (5) becomes a
closed equation for the function $\hat{g}(z|k)$,



$$\hat{g}(z|k) = z \sum_{k'} \sum_{m=0}^{\min(k,k')-1} P(k'|k)\,\phi(m|k,k')\,[\hat{g}(z|k')]^{k'-m-1}. \tag{6}$$



The percolation transition takes place when Eq. (6),
evaluated at $z = 1$, admits as a stable solution
$\hat{g}(z=1|k) = \zeta(k) \neq 1$, that is, there is a
finite probability $(1 - \zeta(k))$ that the branching
process extends up to infinity. To analyze the stability of
Eq. (6) near the fixed point $\hat{g}(z=1|k) = 1$ we study a
perturbative solution
$\hat{g}(z=1|k) \approx 1 + x(k)\epsilon$ in the limit
$\epsilon \to 0$. From Eq. (6),



$$x(k) = \sum_{k'} (k' - 1 - m_{kk'})\,P(k'|k)\,x(k'), \tag{7}$$



using that $m_{kk'} = \sum_m m\,\phi(m|k,k')$. The transition
between the percolated and the fragmented phases is given by
the properties of the matrix $(k' - 1 - m_{kk'})P(k'|k)$,
and, in particular, by its maximum eigenvalue $\Lambda_m$.
When $\Lambda_m > 1$ the network is in the percolated phase,
in which a macroscopic fraction of the system becomes
globally connected. In the opposite situation, the network is
a set of small disconnected clusters.
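The eigenvalue criterion can be explored with a short sketch. It builds the matrix $(k' - 1 - m_{kk'})P(k'|k)$ under the simplifying assumption of an uncorrelated network, $P(k'|k) = k'P(k')/\langle k\rangle$, with no degree-1 vertices, and extracts the leading eigenvalue by power iteration. The degree distribution, the callable `m_of`, and the function names are hypothetical choices made for the illustration.

```python
def largest_eigenvalue(matrix, iterations=500):
    """Power iteration for the leading eigenvalue of a non-negative
    square matrix (the iterate is rescaled by its largest entry)."""
    n = len(matrix)
    x = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        y = [sum(row[j] * x[j] for j in range(n)) for row in matrix]
        lam = max(y)
        if lam == 0.0:
            return 0.0
        x = [v / lam for v in y]
    return lam

def branching_matrix(degrees, pk, m_of):
    """Matrix (k' - 1 - m_kk') P(k'|k), assuming an uncorrelated network
    where P(k'|k) = k' P(k') / <k>. m_of(k, kp) returns m_kk'."""
    mean_k = sum(k * p for k, p in zip(degrees, pk))
    return [[(kp - 1 - m_of(k, kp)) * kp * p / mean_k
             for kp, p in zip(degrees, pk)]
            for k in degrees]

degrees = [2, 3, 4]                # illustrative degree classes
pk = [0.5, 0.3, 0.2]               # illustrative P(k)
lam_unclustered = largest_eigenvalue(
    branching_matrix(degrees, pk, lambda k, kp: 0.0))
lam_clustered = largest_eigenvalue(
    branching_matrix(degrees, pk, lambda k, kp: 0.5))
```

Without triangles the eigenvalue reduces to $\langle k(k-1)\rangle/\langle k\rangle$, and a constant multiplicity lowers it by exactly that constant in this simplified setting, so clustering moves the system closer to the fragmented phase.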

The simplest case of clustered network corresponds to
$m_{kk'} = m_0$, with $m_0 \in [0,1]$. In this situation,
from Eq. (1) one obtains $c(k) = c_0 (k-1)^{-1}$, where
$c_0$ is a function of $m_0$ to be determined. Hence, small
degree nodes are highly clustered whereas high degree ones
are less clustered. This specific form of $c(k)$ is
particularly important since it represents the maximum level
of clustering one can impose in a network without introducing
at the same time degree-degree correlations. This will allow
us to analyze the effect of triangles without any
interference from two-point correlations. Hereafter, we will
refer to levels of clustering below this threshold as weak
transitivity.

FIG. 1: Relative size of the giant component as a function
of $c_0$, $c(k) = c_0/(k-1)$, for different average degrees and an
exponential degree distribution for networks generated with
the algorithm of Ref. [20] (network size is N = 10^). Solid
lines correspond to the numerical solution of Eq. (6). In the
case $c_0 = 0$, we recover the results of the configuration model.

In fact, two-point correlations can be totally avoided
except for vertices of degree $k = 1$, which do not
participate in triangles and must necessarily follow a
different connection pattern. The clustering factor
$c_0(m_0)$ takes in this case the form



$$c_0(m_0) = m_0\,\frac{1 - \dfrac{2P(1)}{\langle k\rangle} + P(1,1)}{1 - \dfrac{P(1)}{\langle k\rangle}}. \tag{8}$$



The probability $P(1,1) \equiv x$ is the smallest solution of
the following quadratic equation (the derivation will be 
given in a forthcoming publication) 



W , 2P(1) 



1 - {^y (fc) 



p2(i) 



= (9) 



where $\langle\phi\rangle'$ is the average of
$\phi(0|k,k')$ over the set of vertices of degrees larger
than 1. Then, the maximum eigenvalue of the matrix
$(k' - 1 - m_{kk'})P(k'|k)$ can be analytically computed,
which yields the percolation condition



$$\frac{\langle k(k-1)\rangle}{\langle k\rangle - P(1)} > \bigl(1 + c_0(m_0)\bigr)\,\frac{m_0}{c_0(m_0)}. \tag{10}$$



For very low clustering, we recover the well-known result
for percolation in random networks. The immediate conclusion
seems to be that clustering changes the position of the
critical point. However, in the case of SF networks, the
left-hand side of Eq. (10) diverges in the thermodynamic
limit and, therefore, in SF networks weak transitivity is not
able to restore a finite percolation threshold and, hence, a
finite epidemic threshold.
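The divergence can be made tangible with a small numerical sketch. It evaluates the moment ratio $\langle k(k-1)\rangle/\langle k\rangle$, which controls the left-hand side of Eq. (10), for a power-law degree distribution truncated at an increasing maximum degree. The exponent, the minimum degree of 2 (so that $P(1) = 0$), and the cutoffs are illustrative assumptions.

```python
def moment_ratio(gamma: float, k_min: int, k_max: int) -> float:
    """<k(k-1)> / <k> for P(k) proportional to k**(-gamma) on k_min..k_max.
    The normalisation of P(k) cancels in the ratio."""
    ks = range(k_min, k_max + 1)
    weights = [k ** (-gamma) for k in ks]
    num = sum(k * (k - 1) * w for k, w in zip(ks, weights))
    den = sum(k * w for k, w in zip(ks, weights))
    return num / den

# Cutoffs 10^2 ... 10^5 for an SF network with exponent 2.5.
ratios = [moment_ratio(2.5, 2, 10 ** e) for e in range(2, 6)]
```

For an exponent of 2.5 the ratio keeps growing with the cutoff (roughly as the square root of the maximum degree), whereas the right-hand side of Eq. (10) stays bounded, which is why the condition is always met in the large-size limit.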

To check the accuracy of the present formalism, we
generated clustered random networks using the algorithm
introduced in Ref. [20]. We simulated networks of 10^ nodes
with an exponential degree distribution and a clustering
coefficient $c(k) = c_0 (k-1)^{-1}$. In Fig. 1 we compare the
relative size of the giant connected component (gcc) as a
function of $c_0$ with the numerical solution of Eq. (6). As
can be seen, the effect of clustering is to reduce the size
of the giant connected component (in agreement with previous
results). The effect is so strong that, in networks with a
moderate average degree, it can completely fragment the
network when $c_0$ exceeds a critical value. In other cases,
the reduction of the size can be more than 50%. For values of
$c_0 \in [0, 0.5]$, the agreement between our formalism and
the numerical simulations is excellent. Beyond this point,
our approximation slightly overestimates the size of the gcc.
This is mainly due to the fact that in this regime links of
multiplicity larger than 1 appear which, in turn, induce the
presence of some loops of order four.

FIG. 2: Relative size of the giant component as a function of the
global clustering C for scale-free networks with $\gamma = 3$ and
$\gamma = 2.5$ and strong transitivity. The giant components converge to
a constant value independent of the level of clustering. Each
point corresponds to a network of size N = 10^.
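A numerical solution of Eq. (6) at $z = 1$ can be sketched by fixed-point iteration. The version below is a simplified illustration, not the procedure used for the figures: it assumes an uncorrelated network with no degree-1 vertices, a constant multiplicity `m0` between 0 and 1, and uses the generating-function form of Eq. (4), which gives the giant component size as $S = 1 - \sum_k P(k)\,[\hat{g}(1|k)]^k$. All names are hypothetical.

```python
from math import comb

def giant_component_size(degrees, pk, m0, tol=1e-12, max_iter=10_000):
    """Iterate Eq. (6) at z = 1, assuming an uncorrelated network with
    constant multiplicity m0 and minimum degree >= 2.
    Returns S = 1 - sum_k P(k) * u_k**k, where u_k = g(z=1|k)."""
    mean_k = sum(k * p for k, p in zip(degrees, pk))
    q = [k * p / mean_k for k, p in zip(degrees, pk)]   # P(k'|k), independent of k

    def phi(m, k, kp):
        m_max = min(k, kp) - 1
        c = m0 / m_max                                   # Eq. (2) with m_kk' = m0
        return comb(m_max, m) * c**m * (1 - c)**(m_max - m)

    u = {k: 0.5 for k in degrees}
    for _ in range(max_iter):
        new = {k: sum(qp * sum(phi(m, k, kp) * u[kp] ** (kp - m - 1)
                               for m in range(min(k, kp)))
                      for kp, qp in zip(degrees, q))
               for k in degrees}
        done = max(abs(new[k] - u[k]) for k in degrees) < tol
        u = new
        if done:
            break
    return 1.0 - sum(p * u[k] ** k for k, p in zip(degrees, pk))

# Illustrative case: every vertex has degree 3.
s_unclustered = giant_component_size([3], [1.0], m0=0.0)
s_clustered = giant_component_size([3], [1.0], m0=0.5)
```

For the degree-3 example the fixed point can be found by hand: with m0 = 0.5 the recursion is u = (0.75 u + 0.25)^2, whose stable solution is u = 1/9, so the clustered giant component is slightly smaller than the unclustered one.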

We now turn our attention to the case of strong
transitivity, which corresponds to functions $c(k)$ decaying
slower than $k^{-1}$. In this case, clustering and two-point
degree correlations are intimately coupled. A heuristic
argument is as follows: if a vertex with a high degree also
has a high clustering coefficient, many of its neighbors will
be connected among them, which induces an assortative
behavior. In other words, to generate random networks with
strong transitivity we need to introduce some mechanism
generating assortativity. However, it is not possible to
obtain a perfect assortative pattern in SF networks for
arbitrarily large degrees and, as a consequence, the maximum
level of clustering is limited. The algorithm of Ref. [20]
has a free parameter which allows one to control the
assortativity of the resulting network so that SF networks
with high clustering can be generated. We quantify the level
of clustering as $C = (1 - P(1))^{-1} \sum_k P(k)\,c(k)$, so
that $C$ is defined in the interval [0, 1]. In Fig. 2 we show
the relative size of the giant component as a function of
$C$. As in the case of weak transitivity, clustering reduces
the size of the giant component. However, after a certain
value, the size of the giant component stabilizes to a
constant value which is independent of $C$. Therefore, SF
networks with high levels of clustering have giant components
which are smaller than their counterparts in networks without
clustering. But what are the resilience properties of those
giant components against random removal of edges?
To answer this question, we have generated two SF networks
with $\gamma = 2.5$, one with the maximum level of clustering
($C = 0.71$) and the other without clustering, and applied a
random removal of edges on the corresponding giant
components. The results are shown in Fig. 3 (top graph). The
giant component of the clustered network turns out to be more
resilient than the giant component of the unclustered one.
Since SF networks without clustering do not have a
percolation threshold, we conclude that clustering, even at
high levels, cannot restore the percolation and epidemic
thresholds in random SF networks.

FIG. 3: Top: Relative size of the giant component in relation
to the original size ($N_{gcc} \approx 601353$ for the unclustered
network, and $N_{gcc} \approx 125353$ for the clustered one) as a function
of the fraction of removed edges, $q$. SF nets with $\gamma = 2.5$ and
N = 10^ are simulated for: i) a clustered network, $C = 0.71$
(circles), ii) an unclustered one (squares), and iii) the
randomized gcc of the clustered net. The inset shows a zoom of
the area close to $q = 1$. Bottom: Relative sizes of the giant
k-cores for i) and iii) and the cumulative degree distribution.
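The random edge removal experiment can be mimicked on any edge list with the sketch below. It is an illustrative implementation, not the simulation code behind the figures: each edge is deleted independently with probability `q`, and the largest surviving component is measured with a union-find structure. The function names and the seeded generator are assumptions.

```python
import random

def largest_component(n, edges):
    """Size of the largest connected component of a graph on nodes 0..n-1,
    computed with union-find (union by size, path halving)."""
    parent = list(range(n))
    size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            if size[ra] < size[rb]:
                ra, rb = rb, ra
            parent[rb] = ra
            size[ra] += size[rb]
    return max(size[find(v)] for v in range(n)) if n else 0

def relative_gcc_after_removal(n, edges, q, seed=0):
    """Remove each edge independently with probability q and return the size
    of the largest surviving component relative to the undamaged one."""
    rng = random.Random(seed)
    kept = [e for e in edges if rng.random() >= q]
    return largest_component(n, kept) / largest_component(n, edges)

# Illustrative input: a ring of 10 nodes.
ring = [(i, (i + 1) % 10) for i in range(10)]
intact = relative_gcc_after_removal(10, ring, q=0.0)
destroyed = relative_gcc_after_removal(10, ring, q=1.0)
```

Sweeping `q` from 0 to 1 on the giant component of a generated network and averaging over seeds would produce a curve of the kind described for the top panel of Fig. 3.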

However, the degree distributions of the giant connected
components can be different, a fact that could explain the
observed differences in the resilience properties. To check
this point, we have randomized the gcc of the clustered
network while keeping its degree distribution fixed (see the
curve labeled Randomized in the top of Fig. 3). This network
is more resilient than the clustered one for all levels of
damage except for very high values, for which the gcc of the
randomized network goes to zero faster due to finite size
effects. This is illustrated in the inset of Fig. 3. The
first arrow indicates the threshold computed with the formula
$q_c \approx 1 - \langle k\rangle/\langle k(k-1)\rangle = 0.986$,
whereas the second arrow indicates the threshold due to
finite size effects for the clustered net, which is placed
closer to 1. Therefore, clustered networks are less sensitive
to finite size effects than equivalent random ones. This can
be understood by analyzing the k-core decomposition of the
networks. The k-core is the maximal subgraph such that all
its nodes have $k$ or more connections within the subgraph.
In the bottom plot of Fig. 3 we show the relative size of the
giant k-core for both networks. For small $k$, the randomized
network has k-cores which are bigger than the ones of the
clustered net, which explains why it is more resilient.
However, for very large degrees, the clustered network has
bigger k-cores, that is, there exists a small but finite core
of vertices with very large degrees highly interconnected
among them, which makes the network less prone to finite size
effects. We also show the cumulative degree distribution
$P_c(k)$, since it bounds the sizes of the k-cores, which,
for the clustered net, decay as a function of $k$ with the
same exponent.
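The k-core definition given above translates directly into a pruning procedure. The following is a minimal sketch of that standard technique, assuming the graph is stored as a dictionary of neighbor sets; the function name and the toy graph are illustrative.

```python
def k_core(adj, k):
    """Return the node set of the k-core: repeatedly delete nodes that have
    fewer than k neighbours inside the remaining subgraph."""
    alive = set(adj)
    degree = {v: len(adj[v]) for v in adj}
    stack = [v for v in alive if degree[v] < k]
    while stack:
        v = stack.pop()
        if v not in alive:          # already pruned via an earlier entry
            continue
        alive.remove(v)
        for w in adj[v]:
            if w in alive:
                degree[w] -= 1
                if degree[w] < k:
                    stack.append(w)
    return alive

# Toy graph: a 4-clique on nodes 0..3 with a path 3-4-5 attached.
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
toy_adj = {}
for a, b in toy_edges:
    toy_adj.setdefault(a, set()).add(b)
    toy_adj.setdefault(b, set()).add(a)

core3 = k_core(toy_adj, 3)
```

In the toy graph the attached path is peeled away and only the clique survives as the 3-core, which mirrors how a tightly interconnected set of high-degree vertices persists at large $k$.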

Summarizing, we have introduced a theoretical framework to
analyze percolation properties of clustered networks. We have
shown that, although clustering strongly affects the
percolation properties and the sizes of the giant components,
it cannot restore the percolation and epidemic thresholds in
random SF networks, thus extending this important result to a
wider class of networks, closer to the real ones. It is also
worth mentioning that these results can also be applied to
other epidemiological models like the SIS model.